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Abstract 

yambo is an ab initio code for calculating quasiparticle energies and optical properties of electronic systems within the 
framework of many-body perturbation theory and time-dependent density functional theory. Quasiparticle energies 
are calculated within the GW approximation for the self-energy. Optical properties are evaluated either by solving 
the Bethe-Salpeter equation or by using the adiabatic local density approximation, yambo is a plane-wave code 
that, although particularly suited for calculations of periodic bulk systems, has been applied to a large variety of 
physical systems, yambo relies on efficient numerical techniques devised to treat systems with reduced dimensionality, 
or with a large number of degrees of freedom. The code has a user-friendly command-line based interface, flexible 
I/O procedures and is interfaced to several publicly available density functional ground- state codes. 
PACS: 71.35.-y, 71.15.-m, 71.45.Gm, 71.15.Qe 
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PROGRAM SUMMARY 

Program title: yambo 
Journal Reference: 
Catalogue identifier: 

Program obtainable from: http: / /www.yambo-code.org| 

Licensing provisions: This program is distributed under the GNU General Public License v2.0 ( see |http: //www.gnu.org/| 
for details) 

Programming language: Fortran 95, C 

Computer (s) for which the program has been designed: any computer architecture, running any flavor of UNIX 
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Operating system(s) under which the program has been tested: GNU/Linux, AIX, Irix, OS/X 
RAM required to execute program with typical data: 10-1000 Mbytes 
Number of processors used: Up to 100 

Supplementary material: Manuals, tutorials, background theory and sample input files are available for download or inspection 
from the website http:/ /www.yambo-code.org| 

Keywords: many-body perturbation theory, density functional theory, time-dependent density functional theory, self-energy, 
Bethe-Salpeter equation, excitons, quasiparticles, plasmons 

PACS: 71.35.-y 71.15.-m, 71.45.Gm, 71.15.Qe 

Classification: 7.3 Electronic Structure, 4.4 Feynman Diagrams, 7.2 Electron Spectroscopies, 18 Optics 
External routines/libraries: 

■ BLAS (htt p://www.netlib.org/ blas/) 

• LAPACK (|http:// www.netlib.org/lapack/1 

• MPI (www-unix.mcs.anl.gov/mpi/) is optional. 

• BLACS ( ^http://www. netl ib.org/scalapack/ ) is optional. 

• SCALAPACK l|http: //ww w. netlib.org/scalapack/ ) is optional. 

• FFTW (http:/7www.fftw.org/ ) is optional. 

• netCDF {www.unidata.ucar.edu/software/netcdf/) is optional. 
Nature of problem: 

Calculation of excited state properties (quasiparticles, excitons, plasmons) from first principles. 
Solution method: 

Many body perturbation theory (Dyson equation, Bethe-Salpeter equation) and time-dependent density functional theory. 
Quasiparticle approximation. Plasmon-pole model for the dielectric screening. Plane— wave basis set with norm conserving 
pseudopotentials . 

Unusual features: 

During execution, ycimbo supplies estimates of the elapsed and remaining time for completion of each runlevel. Very friendly 
shell-based user-interface. 

Additional comments: 

yambo was known as "SELF" prior to GPL release. It belongs to the suite of codes maintained and used by the European 
Theoretical Spectroscopy Facility (ETSF) [l] 

Running time: 

The typical yambo running time can range from a few minutes to some days depending on the chosen level of approximation, 
and on the property and physical system under study. 

LONG WRITE-UP 



1. Introduction 

Ah initio calculations in the framework of density functional theory (DFT) [5] have yielded high-quality 
results for a large variety of systems, ranging from periodic solids to molecules and nanostructures [2]. These 
results are however mostly limited to quantities related to the electronic ground state^ whereas additional 
phenomena that occur in the excited state are not correctly described [3]. 
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It was recognized at an early stage [1] that in extended systems the standard approximations for DFT - 
the local density (LDA) or generalized gradient (GGA) approximations - fail to describe, among other effects, 
the band gap of insulators and semiconductors [3]. In contrast, many-body perturbation theory (MBPT) [5] 
provides, by means of the quasiparticle (QP) concept, a more adequate and accurate approach that yields 
band gaps (and band structures, in general) in good agreement with the experimental values [31 ■ 

While the first successful QP calculations were performed in the mid 80 's 6 , it was only in the late 
90 's that many-body effects have been included in the ab initio calculation of optical properties of real 
materials |7]. It is now well known that a quantitative description of the optical response of an interacting 
electron system must account for electron-hole interactions (excitonic effects) [3 . This is achieved by solving 
the Bethe-Salpeter (BS) equation for the electron-hole Green's function, within the MBPT framework. 
Nowadays, solving the BS equation is the state-of-the-art approach for calculating optical properties in 
extended systems. The importance of including excitonic effects is clear on comparison with the experimental 
absorption spectra of semiconductors and insulators. In particular, for wide-gap insulators there is hardly 
any resemblance between the spectrum calculated within a noninteracting theory and the experiment. 

An alternative approach to the study of correlation in many-body systems is given by time dependent 
DFT (TDDFT) [8]. Similar to the paradigm of DFT for ground-state properties, TDDFT has emerged as a 
very powerful tool for the description of excited states. In principle TDDFT is an exact theory for neutral 
excited state properties. Nevertheless, in practice it has a number of commonly cited failures related to 
the approximation of the exchange-correlation (xc) kernel. One example is the difficulty encountered when 
studying the optical properties of extended systems; another is the severe underestimation of high-lying 
excitation energies in molecules. On the other hand, the combination of TDDFT with simple approximations 
for the xc kernel (based on the homogeneous electron gas) has been successfully applied to the study of the 
optical response of molecules and nanostructures. 

Quasiparticles, excitons and plasmons are the excitations that can be calculated using the yambo code. 
These excitations are ubiquitous in the ab initio description of the electronic and optical properties of any 
physical system. The ysunbo code uses as input the result of standard DFT calculations obtained by means 
of publicly available codes |9|10| . The theoretical tools implemented in yambo are TDDFT and the BS 
equations for the response function and the Dyson equation in the GW approximation for the QPs. 

The paper is structured as follows. In Section [2] we introduce the most important theoretical concepts as 
they are utilized in ysunbo, before describing some of the most important numerical algorithms implemented 
in the code in Section[3l Section[4]outlines the structure and capabilities of yaunbo. A more detailed description 
of how the code is actually utilized is presented in Section [3 before some brief notes regarding installation 
in Section [HI Finally, an illustrative example of a typical yamibo calculation is outlined in Section [T] 



2. Theoretical background 

2.1. Quasiparticles: the plasmon-pole approximation 

MBPT is a rigorous approach based on the Green's function method, and provides a proper framework 
for accurately computing excited state properties. For details of the Green's function formalism and many- 
body techniques applied to condensed matter, we refer the reader to several comprehensive papers in the 
literature |5lllj . Here we shall just present some of the main equations used for the quasiparticle and optical 
spectra calculations. 

The basic component of a many-body perturbative expansion is the reference noninteracting system, that 
in yambo is represented by the solution of the DFT Kohn-Sham (KS) equations. In the following we will 
label these single particle levels as |nk), n being the band index and k the generic vector of the grid used 
to sample the Brillouin Zone (BZ). In this basis the noninteracting Green's function G" takes the form 
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fnk being the occupation factor and £„k the KS energies. The basic relation between G'^ and the exact 
Green's function is given by the Dyson equation 

-1 



(2) 



where the contribution due to the DFT exchange-correlation potential is removed from the single- 
particle energies appearing in G^^ (t^) in order to prevent double counting of correlation effects induced by 
the self-energy E. Since the basic physical process that distinguishes a bare particle from a quasiparticle 
is the screening of the particle by means of the polarization of the surrounding medium, yambo uses the 
GW approximation for the electronic self-energy S [12] which is diagrammatically depicted in Fig. [TJ In 
this approximation the self-energy is a function of G*^ and of the inverse dynamical dielectric function 
e'^ (ri,r2;a;), and it is composed of an exchange (x), and of a correlation (c) part, 

E„kM = E?Jk + EJ=.kH- (3) 
The exchange part is simply the Fock term of the Hartree-Fock self-energy, and it can be rewritten as 



SJJk=(^ik|S-(ri,r2)|nk> 



E 



BZ 



(2^)3 2^ 



v{q + G) |p„„(k,q, G)p/m(k 



(k-q), 



(4) 



where p„„i(k, q, G) = (nk|e'('i+'^) '')|TOk — q), G are the reciprocal lattice vectors, and w (q + G) = 47r/|q- 
Gp. The correlation part of the self-energy is given by 



S^,kM = ("k|S^(ri,r2;(^)|nk>=i^ / 

™ Je 



G,G' 



in 



(k,q,G)p (k,q, G')x 



dw'G^k-q - ^GG' (q. ■ (5) 



The energy integral entering Eq. ([5]) can be solved once the inverse dielectric function is known. The equation 
of motion for e^^ follows from that of the reducible response function x [OJ as 



ecG' (i' ^) = ^GG' + w (q + G) XGG' (q, t 



(6) 



The GW approximation for the self-energy is obtained when x is calculated within the random phase 
approximation (RPA) [11] 



G 



, (q,t^) ■ 



XGG' (q, w) = ^GG" - ^^(q + g")xgg" (q, ^ 

The noninteracting response function is easily calculated in terms of the bare Green's function Gq: 

dk 

^Pn'nk (q. G) Pn'nk (q, G') /„k-q (1 - fn'k) X 

1 1 



(7) 



XGG' (q,^)=2^ / 
„„' Jbz 



(27r)3 



(8) 



As the numerical integration of in Eq.((5]) would require the inversion of Eq.([7]) for many frequency 
points, yarnibo adopts the plasmon-pole approximation (PPA) for the GW self-energy [12]. In the PPA the 
e^^ function is approximated with a single pole function 



Cgg' (q' ^) ~ ^GG' + Rgg' (q) (t^ - f^GG' (q) + iO^ 



{lu + ncG' (q) - 



(9) 



and the residuals Rgg' and energies f^GG' are found by imposing the PPA to reproduce the exact 
function at = and lo = iEppA , with -BppA being a suitable user-defined parameter. 
Using Eq. [2]and assuming /„k to be either 1 or we have that: 



{lj - e„k) G„k (c^) = 1 + [S„k (^) - Kfk] Gnk (c^) . 



(10) 
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Fig. 1. Diagrammatic representation of the GW approximation for the self-energy operator. 

The key approximation is now to take the first order Taylor expansion of the self-energy around e„k (Newton 
approximation) in order to get 
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with 



E'^^ — e„k + ^nk [S„k (^nk) ^ Kfk] J 



■^nk — 



1 - 



dio 



H -1 



(11) 

(12) 
(13) 



Eqs. ([T^HT5|) constitute the QP approximation [T^ . 

It is important to note, at this stage, that by including explicitly the electronic occupations /„k, yambo can 
be equally applied to semiconductors, insulators and metals. In the latter case, however, the plasmon-pole 
approximation can be questionable when some of the valence orbitals are spatially localized (like in d or / 
metals) [13]. Nevertheless for metals in general the RPA is an excellent approximation to the calculation 
of optical properties, as the efficient screening occurring at the Fermi surface prevents the formation of 
excitonic states. 



2.2. Optical properties: the Bethe-Salpeter equation 

The evaluation of the response function x rnakes it possible to calculate the macroscopic dynamical 
dielectric function ej\/ and polarizability a. In particular the macroscopic dielectric function is defined in 
terms of the microscopic inverse dielectric function as |14| 

CM (w) = lim 1. , (14) 

I Jg=OG'=0 

where e is the matrix in the space of reciprocal vectors G defined in Eq. ([5]). Equation (|14p implies that, 
in general, one cannot take the simple spatial macroscopic average of the dielectric function 14J since the 
charge redistribution induced by the interaction with light induces, in turn, the formation of local microscopic 
fields — the local field effects. Such effects are particularly important in nanoscale materials where confinement 
induces the formation of microscopic fields that counteract the external applied perturbation 'W . Similarly 
the dynamical polarizability of a zero-dimensional electronic system is defined as 

"(t^) = -7^ lim ^XG=oG'=o (q,'^) , (15) 
47r q^o 

where is the unit cell volume. 
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Fig. 2. The calculated absorption spectra of solid Si02 within the RPA and through solving the BS equation are compared 
with the experimental curve. Calculations taken from Ref. |15j . experiment from Ref. |16) . 



The RPA to Eqs. (|14ll5p is often inadequate to describe the electronic correlations occurring in the response 
function. In practical applications the RPA does not yield optical absorption spectra in good agreement with 
experiments for several insulating and metallic systems [3]. For example, in the case of Si02, this discrepancy 
has led to extensive debates over the past forty years about the nature of the four well-defined peaks observed 
in the experiment (See Fig. [2]). The reason for the poor performance of the RPA is that the response function 
X measures the change in the electronic density induced by the external applied potential. In a noninteracting 
system the RPA for x is exact, but self-energy corrections modify the electronic density and, consequently, 
the RPA approximation is not valid anymore. Therefore, an estimation of the importance of corrections to 
the RPA can be obtained by looking at the value of the gap correction. The larger the gap, the less adequate 
is the RPA. The gap of Si02, indeed, is ~ 10 eV and the RPA is not even qualitatively correct. 

The drawbacks of the RPA are solved by using a more elaborate equation of motion for x that takes 
into account the effect of electron-electron correlations. This is the BS equation [TT] that can be introduced 
by using the electron-hole (e-h) Green's function L. First we note that the noninteracting q = response 
function, Eq.®, can be rewritten in terms of the noninteracting e-h Green's function L'^: 



lim Xgg' (q, = -« XI [^"'"k (q^ G) p„'„k (q, G')] ^L'k 

q— »0 ' q^O 



(16) 



nn'k 



To avoid the inversion of Eq. (fT4|) . we define a new interacting polarization [3| such that eM {^t^) = 1 
V (9) Xg=og'=o (qj<^)- This function defines a corresponding e-h Green's function L: 



hm XGG' (q, t^) = V V hm [p;,„k (q, G) p„i',„k' (q, G')] L „„/k (^) 
q^o -^-^ q^o 



(17) 



nn'k mm'k' 



The BS equation is an equation for L, obtained by performing a second iteration of Hedin's equation [TT] 



L 



mm k 



[CO) 



Lnn'U (^) 



'5nm^n'ni"5kk' ^ ^nn'k-^ ss'ki i^) 

ss'ki **'ki mm'k' 



(18) 



The BS equation naturally takes into account the electron-hole interaction in the response function as shown 
schematically in Fig. [3l The matrix S„„/ij = VF„„'k — 2V„„'k is the kernel of the BS equation composed of a 

ss ki ss ki ss ki _ _ 

direct electron-electron scattering term (W) plus an exchange interaction {V). Both W and V are integrals 
of Bloch functions 
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Fig. 3. Diagrammatic representation of the BS equation. 
J2 /^'- (k, q = k - ki , G) p;,,, (ki , q = k - ki , G') ea\.,v (q + G') , 



GG' 



Vnn'k = E q = 0, G) pL, (ki, q = 0, G) V (G) . 



(19) 
(20) 



G^O 



It is important to observe that the BS equation is a Dyson-hke equation that should be inverted at each 
frequency point. The dimension of the L matrix can be, however, quite large and the inversion of Eq. ljlSp 
practically impossible. However, for systems with a gap and at zero temperature the noninteracting e-h 
Green's function can be rewritten as 



[LL,k (^)] = ^- 



(21) 



^ — Erik + En'k 

As a consequence it can be shown [3] that the BS equation can be reduced to an eigenvalue problem of the 
Hamiltonian H, 



H 



nn k 
mm'k' 



(^^nk ^n'k) ^nm^n'm' ^] 



kk' 



ifn 



fn 



2V 



nn k * ' nn k 
mm'k' mm'k' 



(22) 



The Hamiltonian in Eq. (j22p is in general non-Hermitian. Nevertheless yaunbo adopts the standard Tamm- 
Dancoff approximation \17\ , in which only e-h pairs at positive energy are considered and the Hamiltonian H 
is Hcrmitian. Finally the dielectric function can be expressed in terms of the eigenstates |A) and eigenvalues 
Ex of H: 



ejvf (oj) = 1 — lini 



Stt 



q^o \q\^nNy 



''"'"k (q> G) p„',„k' (q, G') Y 



n'nk \ m'mk'/ 



nn'k m^n'k' 



(23) 



with = (ri'rtk|A) being the eigenvectors of H. 

In the case of semiconductors the BS approach induces only a minor modification of the absorption 
spectrum. For wide-gap insulators, instead, the energies Ex may fall within the single particle gap. In this 
case the eigenstates of the BS equation are called bound excitons [3]. A typical example is shown in Fig. [2] 
for the case of solid Si02 |18I15| . 

2.3. Time- dependent density functional theory 



Time-dependent density functional theory (TDDFT) is gaining increasing popularity as an efficient tool 
for calculating electronic excitations in finite systems, thanks to its simplicity and moderate computational 
cost. In TDDFT the exact polarization function satisfies a Dyson-like equation that reads 



XGG' (q,t^) = xgg' (q,^) + ["(q + ^") + /xc(q,G,G") xg"g' (q^^ 

G" 



(24) 



Eq. ([24| is similar to Eq.([7]) with an important addition, the xc kernel fxc that accounts for the exchange 
and correlation effects. The exact form of the xc kernel is unknown, but the TDDFT success relies also in 
the fact that even using the simplest adiabatic local density approximation (ALDA) for /xc a good accuracy 
in the evaluation of optical properties can be obtained [19 . 

We would like to mention here some common limitations of Eq. ((24)) . associated with the use of plane- waves, 
when applied to low-dimensional electronic systems. As yambo is a plane-wave code, it uses the super-cell 
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Energy [eV] 

Fig. 4. Calculated photo-absorption spectra at the ALDA level along the x-direction of an ethylene molecule using either the 
real-space real-time code Octopus |21| and yambo in configuration space (see text for detail). Inset; Contour plot of the electronic 
density of the ethylene molecule placed in a cubic super-cell of side 10 A. 



approach to study small systems like molecules and nanoscale materials. This approach usually leads to two 
types of problems. 

The first of these is the possibility of fictitious interactions between super-cell replicas (images). To treat 
this problem, yeimbo is capable of removing the long-range tail of Coulomb potentials following the method 
described in Ref . [20] . The second problem derives from the numerical instability of Eq. ((24|) that is induced 
by the presence of large regions of space with vanishing density, as occurs in large supercells (see inset of 
Fig. [3]) . Unfortunately, this vanishing density induces some complications to Eq. ([M]) when /xc is evaluated 
at the ALDA level. Indeed, the ALDA kernel is defined by: 

f^r^r, r'; t) = 5{v v') '^""f^"^ U^„(,,), (25) 



where ^'™^(?^) is the exchange and correlation potential of the homogeneous electron gas. In the region of 
space with vanishing density we have that /^^^ {n) — > oo so that the evaluation of the term involving the 
/xc in Eq. ([M]) cannot be directly calculated in reciprocal space, because the f^^^^ is not well defined. 
To overcome this problem yambo can solve the TDDFT equation within the ALDA in the basis of the e-h 
pairs, instead of in reciprocal space [Eq. (jM]) ]. In this case the TDDFT equation has the same form of the 
BS equation, Eq. p8)) . with the kernel 2 given by 

ss'k' ss'k' ss'k' 

with 

K^n^^^ =2 f f drdr>:k(r)0n'k(r)/x^,^°^(r,r';t = 0)ct>^,^,{r')ct>U,{r'). (27) 

In Fig. |4] we show, as an example, the ALDA dynamical polarizability along the x-axis of the ethylene 
molecule. The yambo result is compared with a calculation made with the code octopus [21] where all the 
quantities are defined in real-space, and the results are not affected by the presence of regions with zero 
density and of images due to the super-cell approach. We can see that yambo well reproduces the main 
excitation peak at 7.5 eV, that is composed by bound (localized) electron-hole states [22]. 
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Brillouin zone sampling size |q| [lattice units] 



Fig. 5. Left panel; matrix element of T,x at the X-point of a trans-polyacetylene one-dimensional polymer [23j in function of 
the number of k points used to sample the Brillouin zone, with and without the random integration method. We notice that 
the RIM removes the divergence of Tj^ as a function of the size of the BZ sampling grid. Right panel: /q (G = 0) integrals as 
a function of |q| for 40 k Brillouin zone sampling size. The RIM cures the non analyticity of the Coulomb integral near the F 
point. Far from F the RIM gradually approaches the bare integral. 

3. Some numerical aspects 

The yambo code has been applied to a wide range of materials, from bulk compounds to molecules and 
nanostructures. As a consequence we have developed several ad hoc methods specifically designed to solve 
physical and numerical problems that can be commonly encountered. 

3.1. The random integration method 

The definition of the self-energy operator — and of many other quantities — requires an integration in the 
BZ. In practice this integral is replaced by some suitable grid of points. Let's consider for simplicity only 
the Hartree-Fock self-energy, given by Eq.Q. Using a finite grid of transferred momenta q it reads 

K^. « ^ 5] i; (q + G) |p„™(k, q, G)| (28) 

m,q G 

In approximating S^j^ using a finite grid the key assumption is that the integrand of Eq.Q must be a 
smooth function of q. While this is generally true for the oscillators p part, this is not so trivial for the 
Coulomb potential that, indeed, diverges for q ^ 0. 

Nevertheless, in three-dimensional systems this divergence is not affecting the calculation. In fact, the 
phase space volume associated with the F point reduces as |qp when q — > and the divergence is de facto 
removed. In low-dimensional systems this cancellation holds only if a three-dimensional grid is used. When 
the Brillouin zone is sampled with lower-dimensional grid instability problems appear in the evaluation of 
the Coulomb integral. As shown in the right panel of Fig. [5l these instabilities may even grow as a function 
of the Brillouin zone sampling size. On the other hand using a three-dimensional grid is clearly inconvenient, 
and more importantly can make the calculations extremely cumbersome. 

In order to avoid the use of a three-dimensional sampling in low-dimensional systems yernibo offers two 
methods to remove the divergence arising from the Coulomb integrals. The first is based on a cutoff 
Coulomb technique which is discussed in detail in Ref.[20 . The other is the so-called random integration 
method (RIM) 24J. In the RIM the HF self-energy is rewritten as 
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m.q 



,(k,q,G)p/™(k-q) (2^II«(q 
The integral of the Coulomb potential 



(29) 



(30) 



is evaluated in a region i?r around the F point. Representing as i?q the region i?r translated in the general 



q position, this region is chosen in such a way that i?r U i?qj U i?q2 



.R, 



BZ. The RIM is based 



again on the uniformity ansatz described above, but restricted only to the pnm factors. The /q integrals are 
calculated via a three dimensional Monte Carlo technique [25] . 

For practical purpose, it is necessary to evaluate only /q (G = 0) because far from the origin the approxi- 
mation leading to Eq. is still accurate as shown in the right panel of Fig. [51 The importance of the RIM 
is exemplified in the left panel of Fig. (O where we show the convergence of with respect to the k-point 
sampling at the X point of a quasi one-dimensional irans-polyacetylene polymer. 



3.2. The Lanczos-Haydock solver of the BS equation 



Once the BS (or the TDDFT) Hamiltonian H [Eq. (|22p ] has been calculated in the basis of e-h pairs |wck), 
yambo offers two options for calculating the corresponding macroscopic dielectric function eA/(^) and related 
quantities (optical absorption, electron loss and dynamical polarizability): either via the diagonalization, or 
via the Lanczos-Haydock (LH) recursion method '26]. 

With the first option the program diagonalizes H using the standard BLAS/LAPACK routines [27] to 
find the eigenvalues E\ and eigenstates A^^k ^^^^ define the macroscopic dielectric function, Eq. (|23l) . With 
the second option Eq. ([^H]) is rewritten as, 



tM{oj)^l-{P\{Lo^H)-^\P), 



(31) 



where \P) = hmq^o |^bck)(uk - q|e"''i ''|ck). Then Eq. ^ is calculated using the LH method |26l28j . a 
general algorithm to compute the matrix elements of the Green's function {oj — H)~^ applied for the first 
time to solve the BS equation by Benedict and coworkers [29] . 

The LH algorithm recursively builds an orthonormal basis (Lanczos basis) in which H is represented 

as a real symmetric tridiagonal matrix. 



ai 62 











62 0,2 63 : 

■■. ■■. '■. 



(32) 



\0 ■■■ b, a, J 

The first vector of the Lanczos basis is set equal to The next vectors are calculated from the 

three-term relation 

bj+ilqj+i) = H\qj) - aj\qj) - 6j|(7j-i). (33) 
In the Lanczos basis Eq. (|3T]) becomes 

1 



ei,{^)^l-\\Pr- 



(w - ai) - 



(34) 



{lo - 02) 



bl 
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Fig. 6. Left top panel: Dynamical polarizability within the ALDA of the irans-azobenzene molecule calculated using either the 
diagonalization procedure (gray area) or the LH method varying the number of iterations. Left bottom panel: Relative error 
(A) of LH method with respect to the diagonalization varying the number of iterations. Right panel: CPU time ratio between 
the diagonalization procedure and the LH method (50 iterations) as a function of the e— h pairs included in the Hamiltonian. 

At each iteration the program computes a new vector \qj+i) [Eq. (|33p ]. the matrix elements aj, bj+i, and 
thus a better approximation e\j for eM from Eq. p4p . The LH iteration procedure stops when the difference 
\e\j — ejif^^^l in the given range of frequencies is smaller than a user-defined threshold. 

In practice when interested in a finite range of frequencies one needs a number of iterations k much smaller 
than the dimension N of the Hamiltonian (corresponding to number of e-h pairs included in Eq. I22p to get 
an accurate eM(^)- In the left panel of Fig. [6] this is exemplified for the dynamical polarizability calculated 
within the ALDA of the irans-azobenzene molecule: fc w 50 is enough for getting an accurate spectrum up 
to 6 eV, with TV « 4000. As a consequence, the LH method [0{kN'^) floating point operations] is usually 
much faster than the diagonalization procedure [0{N^) fioating point operations]. This is clearly shown 
in the right panel of Fig. [6] already for N w 1000 the LH method starts to be more convenient, and for 
N w 4000 it is about 70 times faster than the diagonalization procedure. The LH method is also more 
convenient in terms of memory since, in contrast with the diagonalization, only three vectors at a time need 
to be stored. Furthermore, the whole procedure has been efficiently parallelized in yambo. The LH algorithm 
is, therefore, the recommended method in yambo for the computation of the macroscopic dielectric function 
and related quantities, while the diagonalization should be used when the eigenvalues and eigenvectors of 
H are explicitly needed, as for example for plotting the excitonic wavefunction via the post-processing tool 

ypp- 

4. Overview of the software 

The structure of the yambo package can be separated into a number of stages, schematically depicted in 
Fig. [71 In the preliminary stage (I), a C/Fortran90 driver passes control to the main yambo executable, or to 
the data converters (a2y, p2y, e2y). The purpose of the latter is to generate the core databases that contain 
the ground state data necessary for starting the code. A mostly procedural data initialization stage (II) 
follows, where some general-purpose databases are prepared for later use. The main physical calculations 
are performed in the third stage (III) . Finally, databases created by yambo can be further manipulated using 
the post-processing tool ypp in a fourth stage (IV). 

Operation of the code follows a series of functionally distinct runlevels. The main runlevels are activated 
by the user via the command- line interface (described in more detail in Sec. 15. 2p . and others are called 
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automatically by the code where dependencies are present. Runlevels have a modular structure, in the sense 
that each one performs specific physical tasks and terminates (in most cases) with the creation of one 
or several databases written onto disk. These database files may then be accessed by different runlevels. 
Runlevels may be skipped in subsequent runs if a database is found that is compatible with the user 
requirements on execution. A more thorough discussion of the databases, and the yambo I/O in general, can 
be found in Sec. 15.31 

The main runlevels are now described in more detail. 

Stage I: 

(i) C driver: governs the actions taken by the rest of the code, yambo uses a standard getopt function [30j 
to parse the command line. This function is called by all executables to acquire the user-defined options 
passed at the command line. The syntax is described briefly in Sec. 15.21 

(ii) Data import/converter: the ground-state electronic structure of the system to be studied is imported 
from ground state codes (see Sec. 15. and converted into the core database files. 

Stage II: 

(i) User input: if command line options were added in Stage I, the executable acts as an input file generator. 
On execution, the code reads default values for input parameters from existing databases and updates 
the values in the input file, after which the code terminates. 

(ii) Data initialization: reorders G-vectors into spherical shells, calculates Fermi level and electronic occu- 
pations, sets up energy grids. 

(iii) Brillouin-zone sampling: expands k-points to full BZ, generates q-point meshes, checks on uniformity 
of grids. 

Stage III: 

(i) Hartree-Fock/Vxc: calculates the matrix elements of the Hartree-Fock exchange self-energy and of the 
DFT potential corresponding to the type of functional used in the ground state code run. 

(ii) Screening: calculates static and dynamical inverse dielectric functions, for use in the evaluation of the 
GW self-energy and of the BS/TDDFT kernel. 

(iii) Quasiparticle: calculates quasiparticle corrections to the Kohn-Sham band structure within the GW 
approximation. 

(iv) Linear response: optical properties within RPA and ALDA with and without local field effects. 

(v) Bethe-Salpeter/TDDFT: creation of the BS/TDDFT Hamiltonian and subsequent diagonalization 
using LAPACK routines or via Lanczos-Haydock iterative procedure. 

Stage IV: 

(i) Post-processing: contains routines for creating k-point grids, analyzing single-particle and excitonic 
wavefunctions and plotting the electronic wavefunctions and density. 

5. Description of the individual software components 
5.1. Main utilities 

As described earlier, the ysimbo package is in fact composed of three separate utilities. The first of these 
is the set of converters (a2y, p2y and e2y) which generate the yambo core databases from the output of 
other ground state codes. The a2y converter imports data from the so-called KSS file as generated by the 
Abinit codefTIP, while p2y imports data written by the Quantum-ESPRESSO/PWscf^ code in the iotk file 
format[31J. Alternatively, e2y can be used to import data from a netCDF formatted file written according 
to the ETSF file-format specifications [32j. As a set of high-level libraries are available [33] that are capable 
of reading and writing this format, it is relatively easy to interface ysunbo with other ground state codes. 

The main computation tool is the yarnibo executable itself. Depending on the command line options used 
(see next section), it can act either as an input file generator or as a straightforward serial or parallel 
executable. 

The last utility is the ypp post-processor, which is generally used to perform short analysis of pre-calculated 
databases. 
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Fig. 7. Simplified schematic chart of the program. Diamond boxes denote some of the most important databases (see text). 



5.2. Command line interface 



Thanks to the C/Fortran driver routine, yarnibo offers a very user- friendly command-hne interface for 
configuring the code at run-time and for creating and editing the main input file. An extended list of options 
can be displayed by using the -H option (see Table [1]). yambo options can be lowercase and uppercase. Upper 
case options may be added at run-time to specify input/output directories (-I/-0, see next Section), skip 
the MPI calls (-N), create a simple report of the existing database properties (-D), and so on. For instance, 
running 

ycunbo -N -I /scratch/tests/silicon-bulk -D 

will ask for a report of databases in the /scratch/tests/silicon-bulk directory, and force the parallel- 
compiled executable to run in serial. 

Lower case options, instead, drive the input file editor. For example, 

yeunbo -i -o c -I ~/tests/silicon-bulk 

will generate an input file (the default being yambo . in) that can be used to run the initialization steps (-i) 
and calculate linear response optics at the RPA level (-o c). A useful feature of the code is that default 
values for parameters are suggested based on their values in the most relevant database, if a value is not 
already present in the input file. Hence if 500 bands are written in the core databases, yambo will suggest a 
range of 1-500 bands for calculating the noninteracting response function. 

Since yambo offers a huge range of tunable parameters, some of which are quite technical, it would be quite 
daunting for an average user to face the full list of parameters when running the code. Hence, an input-file 
verbosity fiag (-V) lets the user decide how detailed the input file is to appear (the remaining parameters 
are set to their default values). After checking the existing databases and input files, yambo invoked in this 
way will redirect the user to edit the newly-created yambo . in file. 

The C parser used by yambo to process the input files is taken from the octopus [21] code. 

5.3. I/O: the yambo databases 

Files created and accessed by yambo are classified according to their purpose and identified by a specific 
prefix: 

(i) Static database files (prefix s . , otherwise known as the core databases) are generated in Stage I (see 
Fig. [7]) by the a2y, p2y and e2y converters. They contain the information concerning the geometry, 
basis set, wavefunctions and energies, etc. 

(ii) Stable database files (prefix db . ) are created by yambo at run-time, usually in Stage II. They are 
generally created once and used to store intermediate data designed to be re-read during subsequent 
executions. 

(iii) Job-dependent database files (prefix db . ) are also created at run-time but hold information that is 
usually specific to a particular runlevel (Stage III). 

(iv) Output files (prefix o . ) arc usually created at the end of certain Stage III runlevels and are intended 
for human use (e.g. suitable for plotting or inspection). 

If the netCDF libraries [34] have been configured for use, these database prefixes are further prefixed by n, 
e.g. ndb.kindx is the netCDF formatted version of db.kindx, the database containing the k/q-point grids. 
Users are strongly advised to use netCDF-linked executables, as the functionality of the code is enhanced. In 
addition there are auxiliary run-time report files (prefix r_) containing the run-time log information, while 
standard output can be redirected onto disk (prefix 1_). 

Creation of each database is controlled by its own particular Fortran subroutine, e.g. src/io/ioQP . F 
creates the database of QP corrections db.QP. However, the actual writing of data is performed at a low 
level by some common modules, in order to minimize problems associated with portability. Some of the most 
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Table 1 

Command line options for the various yambo tools. 



important databases and their respective runlevels are indicated in Fig. [7] as diamond-shaped boxes. 

In order to treat systems with large memory or disk requirements ygrnibo offers the capabihty of fragmenting 
the larger databases into chunks (-S). This functionality is most frequently utilized in the splitting of the 
ground-state wavefunction files according to k-points and/or bands, and in the division of the Bethe-Salpeter 
Hamiltonian according to the k-point index. 

By default, all databases are stored in the subfolder . /SAVE of the working directory. More specific control 
of I/O directories may be accessed through the various uppercase command line options. For instance, 
it is common to store the static databases in a 'core' directory (yambo -I COREPATH), so that they can 
be shared by different processes. Dynamically-created databases can then be placed elsewhere (yambo -0 
other). Furthermore, users may organize their work better according to a "job string" identifier (yambo 
-J JDBNAME), so that databases and output files are placed in a subdirectory with a name specified by 
the user. Finally, long jobs, like the creation or diagonalization of the Bethe-Salpeter Hamiltonian, place 
intermediate results in a RESTART folder, so that the job can be continued following a crash or if cpu time 
limits are exceeded. 

6. Installation instructions 

yambo makes use of the GNU autotools suite (automake/autoconf/libtool) for installation. Hence, the 
standard procedure of 

. / configure 
make all 
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should create executables for yambo, the ypp postprocessor and the basic Abinit converter a2y and place 
them in the bin/ folder of the yambo source directory. Detailed installation instructions are available in the 
manual, including how to enable the p2y and e2y converters, how to link with netCDF, BLACS and FFTW 
libraries, change installation options, and so on. In short, a somewhat complete list of compile-time options 
can be inspected by running the standard command 

./configure — help 

7. Running yambo: excitonic effects in bulk silicon 

In this section we outline the basic steps involved in a typical production run of yambo. Normally one 
would of course start by creating the core databases by importing the output data of some ground state code 
(see Sec. l5.1|) . For the purpose of illustration, however, we have included in the yambo source a pre-compiled 
set of core databases for bulk silicon in the doc/sample directory (these netCDF formatted databases can 
be extracted by following the instructions in the doc/scmiple/bulk_silicon/README file). 

Once the core databases (ns . dbl , ns . wf ) are extracted, running ysunbo in the doc/ sample/bulk_silicon/ 
directory produces as standard output: 

> yambo 

< > [01] Job Setup 

< > [02] Input variables setup 

< > [02.01] K-grid lattice 

<— > [02.02] RL shells 

<— > Shells finder | #################### | [1007,] —(E) —(X) 

< > [02.03] Input (E)nergies [ev] & Occupations 

< > [03] Transferred momenta grid 

<— > X indexes | #################### | [100"/,] —(E) —(X) 

<— > SE indexes | #################### | [100"/,] —(E) —(X) 

< > [04] Game Over & Game summary 

This is the Stage II run that ysunbo enters by default whenever, as in this case, no input file has been 
specified. The results of this setup run (BZ samphng indexes, G vector shells, etc.) are stored in two new 
database files, SAVE/ndb . gops and SAVE/ndb . kindx, that are to be re-used in subsequent runs. Besides the 
two databases, yarnibo writes a report file, r_setup, that contains a detailed list of information about the run 
and the system. Every runlevel generates an appropriate report file. 

At this stage we can carry out a more interesting calculation: as an example we will calculate the optical 
absorption spectra of bulk silicon including excitonic effects. To solve the BS equation we will employ the 
LH algorithm described in Sec. 13.21 First we create the input with the command-line based user interface. 
From Table [T] we see that the options to use are 

> yambo -b -o b -y h 

This command creates the yarnibo . in input file shown in Table [2] and opens it in the default text editor, 
vi. Note that, by reading the databases generated in the setup run, yambo already knows that there are 19 
momenta permitted by the BZ sampling, and that 50 bands were calculated in the preceding ground-state 
run. To perform a test calculation we change some of the values in the input. 

- In Eq. (ll7|) . we restrict the summation from band 2 to band 6: 
7. BSEBands 

2 16 1 # [BSK] Bands range 

7. 

- In the statically screened interaction we use just 51 RL vectors: 

BSENGBlk= 51 RL # [BSK] Screened interaction block size 

NGsBlkXs= 51 RL # [Xs] Response block size 
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optics 

bse 

emls 



# [R OPT] Optics 

# [R BSK] Bethe Salpeter Equation. 

# [R Xs] Static Inverse Dielectric Matrix 

# [R BSS] Bcthe Salpctcr Equation solver 

# [BSK] Resonant Kernel mode. ('x';'c';'d') 



BSresKmod= "xc" 
% BSEBands 
1 I 50 I 
% 

BSENGBlk= 1 RL 
BSENGcxx= 411 RL 
% QpntsRXs 
1 I 19 I 
% 

% BndsRnXs 
1 I 50 I 

% 

NGsBlkXs= 1 RL 
% LongDrXs 

1.000000 I 0.000000 I 0.000000 — # [Xs] [cc] Electric Field 

% 



# [BSK] Bands range 

# [BSK] Screened interaction block size 

# [BSK] Exchange components 

# [Xs] Transferred momenta 



# [Xs] Polarization function bands 

# [Xs] Response block size 



# [BSS] Solvers 'h/d/i/t' 

# [BSS] Energy range 

# [BSS] Damping range 

# [BSS] Energy steps 



BSSmod= "h" 
% BEnRange 
0.00000 I 10.00000 I eV 
% 

% BDmRange 
0.10000 I 0.10000 I eV 

% 

BEnStcps= 100 
% BLongDir 

1.000000 I 0.000000 I 0.000000 I # [BSS] [cc] Electric Field 

% 

Table 2 

yambo input file needed to perform a calculation of the optical properties of bulk silicon including excitonic effects (see text). 



- As the LH method is very fast we increase the number of points on the frequency axis by setting 
BEnSteps=1000. Moreover, to mimic the experimental width of the Ei and E2 peaks, we set 
7o BDmRange 

0.02000 I 0.8000 I eV # [BSS] Damping range 

*/. 

Calhng yambo again, without command line options, will start the calculation, which should last for some 
minutes. The progress of the calculation (including expected/elapsed time for each time-consuming op- 
eration) can be followed from the standard output, or in case of background execution, read from the 
l_optics_bse_emls_bss log file. 

At the end of the run several new files appear in the SAVE folder: the dielectric function ndb.emls, the 
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Fig. 8. Optical spectrum of bulk silicon. The result of the sample run described in the text is compared with a more converged 
calculation |38| and with the experiment |35| . Note that the result of the sample run has been red— shifted by 1 eV to simulate 
the correct QP gap that, instead, is calculated in the converged spectrum '38^. 

BS Hamiltonian matrix ndb.BS_Ql, and the information needed to restart the LH iterative procedure in 
ndb . Haydockjrestart. Furthermore, the run generates two files in the working directory, the report file 
(r_optics_bse_emls_bss) and the output file (o . eps_q001-bh). 

The output file contains the calculated spectra (BS and RPA spectra) that can be visualized with- 
out further processing with standard plotting tools. Plotting the second versus the first column of the 
o . eps_q001-bli file gives the well-known absorption spectrum of bulk silicon within BSE. The result of the 
run is shown in Fig. [51 where it is compared with the independent particle (RPA) spectrum (fourth versus the 
first column of the same file), with the experimental spectrum |35j (included in the doc/ sajiiple/bulk_silicoii/ experiment . d; 
file) and with the result of a more converged calculation |38] . 

8. Appendices 

8.1. The x° poles accumulation 

The noninteracting response function x'^ is a key ingredient of the yajnbo code. It enters, for example, in 
the definition of the RPA dielectric function and of the GW self-energy. Nevertheless the practical evaluation 
of x'^ is often a bottleneck in realistic calculations. The reason is that, as shown in Eq.®, x° contains a 
triple summation over the k points and the occupied and empty electronic levels. For systems with a large 
number of electrons this summation can easily reach millions of elements, that must be multiplied by the 
number of frequencies. 

To provide a tunable and efficient tool to reduce the numerical effort in evaluating x*' yambo follows the 
general idea of the algorithm proposed by Miyake [39] to rewrite x*^ as 




UJ + E„ + iO+ 



1 



1 



(35) 



p 



In Eq. psp the Ip are groups of e-h indexes (nn'k) with similar e-h energies e„'k — e„k-q- The latter are 
approximated with a single pole at Ep with occupation Fp. The number of groups created can be controlled 
by the user by tuning the CGrdSp variable in the input file. 
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The important difference witli Eq.® is that, in Ea. ([551) the evaluation of the oscillators is partially 
decoupled from the energy dependence. As a consequence by decreasing the groups of e-h pairs it is possible 
to make the evaluation of almost independent on the number of frequencies. 

8.2. Oscillator symmetries 

The oscillators 

p„„(k,q, G) = (nk|e*('i+G)--|mk - q) = / dr^k (O e*''i+°)-^V^„k-q (r) (36) 

appear in almost all the quantities calculated in yambo. They are evaluated using efhcient Fast Fourier 
Techniques (FFT) [JD] . Nevertheless yambo uses symmetry arguments to reduce the number of calls to the 
FFT interface. 

If we specify the symmetry operation by rewriting a general point in the BZ as k = KkiBZ, with \iiBZ 
defined in the irreducible wedge of the BZ, Eq. ((36|) reads 

p„„(k,q, G) = dr [e-«''--<^k... (r)] e^('i+«)- [e^^'(''^'>'---'''"""<.H'(k-q),,, (r)] , (37) 
with Go defined as Go = k — q — (k — q)/^^. Hence we find that 

p„„(k, q, G) = / dr <^,k.«. (r) e*'°-°°^"«™fl'(k-q),,, (r) • (38) 
Jn 

In case of spatial symmetries (a similar procedure applies to the time-reversal symmetry) we can rewrite 
the left side rotated wavefunction as WI^bMibz ^ "^nkjsz 

p„„(k,q,G)=/ dr<k..^(r)e'''"^''-''''^"^™fl-i«'(k-q),,,(r). (39) 
Jo 

Finally we notice that the symmetries constitute a group, and, consequently, R^^R' — S*, with S a symmetry 
operation. Thus, at difference with Ea. (j36|) . Eq.([39]) depends only on one symmetry index. By using Eq.([39]) 
the computational cost of calculating all the oscillators at a given k point is reduced by the number of 
symmetries in the star of k. 
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